
Port forwarding #8

Merged
Kitenite merged 4 commits into main from port-forwarding
Oct 30, 2025

Conversation

@Kitenite
Collaborator

Description

Related Issues

Type of Change

  • Bug fix
  • New feature
  • Documentation
  • Refactor
  • Other (please describe):

Testing

Screenshots (if applicable)

Additional Notes

@vercel

vercel Bot commented Oct 30, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| superset-website | Ready | Preview | Comment | Oct 30, 2025 2:49am |

@Kitenite Kitenite merged commit 44783a4 into main Oct 30, 2025
4 of 7 checks passed
@Kitenite Kitenite deleted the port-forwarding branch November 11, 2025 18:18
Kitenite added a commit that referenced this pull request Apr 10, 2026
Create flow uses `git worktree add -b` directly. No try/catch
fallback to checkout. If branch exists despite dedup, the git
command errors and the create fails cleanly.

Checking out existing branches is a separate intent (createFromPr).
Updated decisions doc (decision #8).
Kitenite added a commit that referenced this pull request Apr 11, 2026
* feat(desktop): clone V1 new-workspace composer onto V2 modal

Replaces the tab-based V2 create-workspace modal with a clone of the
battle-tested V1 composer, rewiring only the backend boundaries.

Backend boundary changes (V1 → V2):
- Project list: electronTrpc.projects.getRecents → v2Projects +
  githubRepositories collections
- Branch list: electronTrpc.projects.getBranches* →
  workspaceCreation.searchBranches on host-service
- Create action: 4 V1 mutations (create/createFromPr/openTracked/
  openExternal) → single workspaceCreation.create on host-service
- GitHub issues/PRs list + content: electronTrpc.projects.{listIssues,
  listPullRequests, searchPullRequests, getIssueContent} →
  workspaceCreation.{searchGitHubIssues, searchPullRequests,
  getGitHubIssueContent} on host-service (Octokit via ctx.github())
- Navigation: navigateToWorkspace → navigateToV2Workspace

V2 additions:
- DevicePicker in the composer footer for host target selection; on
  host change, compareBaseBranch resets
- hostTarget field in the draft context

Intentionally dropped for Phase 1 (deferred to Phase 2):
- Branch prefix feature (projects.get, getGitAuthor, settings.
  getBranchPrefix, settings.getGitInfo, resolveBranchPrefix) — crosses
  V2 host boundary, needs host-aware prefix in Phase 2
- Worktree preflight UI (getExternalWorktrees, getWorktreesByProject,
  resolveOpenableWorktrees, worktree badges/filter tab) — host-service
  workspaceCreation.create handles tracked/external/adopt server-side

Host-service endpoints added:
- workspaceCreation.getContext — project + default branch
- workspaceCreation.searchBranches — git branches with hasWorkspace
- workspaceCreation.create — semantic create with outcome resolution
  (created_workspace / opened_existing_workspace / opened_worktree /
  adopted_external_worktree) and path-traversal guard on branchName
- workspaceCreation.searchGitHubIssues — Octokit issue list/search
- workspaceCreation.searchPullRequests — Octokit PR list/search
- workspaceCreation.getGitHubIssueContent — Octokit issue body fetch

* fix: pass organizationId to cloud API calls in workspaceCreation router

After merging main, v2Project.get and v2Workspace.create were converted
to jwtProcedure and now require organizationId in their input. The
host-service workspace.create endpoint was already updated in main, but
our new workspaceCreation router still had four call sites missing it.

* chore: add instrumentation to workspace creation flow

Temporary console.log at key points to debug "workspace created" toast
showing but workspace not appearing:
- useCreateDashboardWorkspace: log input + result
- PromptGroup: log create result + navigation
- workspaceCreation.create: log resolved names, each outcome path

* Update docs

* docs: clean up plan files, keep only final decisions + scenario analysis

Remove superseded docs (v2-create-fix-plan, v2-create-decisions,
v1-workspace-creation-logic). Keep v1-create-scenario-analysis as
V1 behavior reference and v2-create-decisions-final as the 13
implementation decisions.

* feat: implement v2 create decisions — always create, never collide

Host-service workspaceCreation.create rewrite:
- Strip collision detection (no opened_existing_workspace / opened_worktree
  / adopted_external_worktree). Create always creates.
- Sanitize + deduplicate branch name server-side. If branch exists,
  append -2, -3, etc. Renderer sends best-effort name.
- Simplified return: { workspace, warnings } — no outcome field.
- Add setup script execution (runs .superset/setup.sh in worktree,
  blocks until done, non-fatal on failure).
- Remove source and behavior fields from input schema.
- Add sanitize-branch utils (copied from shared/utils/branch.ts).

Renderer PromptGroup rewrite:
- Remove AI branch gen (electronTrpc.workspaces.generateBranchName) —
  boundary violation, needs V1 project ID.
- Renderer computes branch name: user-typed > prompt slug > random UUID.
- Renderer computes workspace name: user-typed > prompt > branch name.
- Single pending phase: "creating" (no "generating-branch" or "preparing").
- Remove buildLaunchRequest, hostUrl resolution, agentConfigsById —
  dead code after removing AI gen and collision paths.

Per decisions in apps/desktop/plans/v2-create-decisions-final.md.

* feat: draft stash for failure recovery

On create submit: snapshot the draft into a zustand atom before
closing the modal. On success: clear the stash. On failure: restore
the stash and reopen the modal so the user can retry with their
prompt, attachments, and linked context intact.

- Added stashedDraft / stashDraft / clearStashedDraft /
  restoreStashedDraft to the new-workspace-modal zustand store
- PromptGroup.handleCreate stashes before closeAndResetDraft,
  restores on catch
- DashboardNewWorkspaceModalContent applies stash on reopen via
  useEffect that reads from the zustand store
- Replaced runAsyncAction with direct try/catch for clearer
  success/failure handling

* chore: add logging + error handling to cloud API calls in create

The create flow was hanging silently when ensureV2Host or
v2Workspace.create failed (e.g. Neon DB schema mismatch). Now
both calls log before running, catch errors explicitly, roll back
the worktree on failure, and throw with a descriptive message
instead of hanging.

* docs: finalize v2 workspace creation status + pending workspace design

- pendingWorkspaces local collection (localStorage-backed via @tanstack/react-db)
  holds full draft data for retry on failure
- Attachments stored as raw blobs in IndexedDB (no compression — files
  are already compressed, IndexedDB has no size limit)
- EventBus workspace:creating events for live step-by-step progress
- /v2-workspace/pending/$pendingId route shows progress, error + retry
- Sidebar renders pending workspaces as clickable skeletons
- Multiple concurrent creates supported
- Host-service returns initialCommands, setup dispatched to terminal pane

* docs: switch from EventBus to polling for create progress

- Host-service writes to in-memory Map during create mutation
- New getProgress query endpoint, pending page polls every 500ms
- PromptGroup owns the create promise (fire-and-forget, survives
  unmount), updates pendingWorkspaces collection on resolve/reject
- Pending page is purely a progress viewer, not an owner
- Sidebar reads collection via useLiveQuery, no polling
- Removed all EventBus references from the design

* docs: add TODO for cleaning up stale createProgress map entries

* chore: add TODO to migrate chat pane uploads to IndexedDB blob pattern

* chore: move IndexedDB migration TODO to ChatLaunchConfig where base64 blobs are stored

* feat: add create progress infrastructure

Host-service:
- In-memory progress map with TTL sweep for stale entries
- getProgress query endpoint (polled by pending page at 500ms)
- Create mutation writes step progress (ensuring_repo → creating_worktree → registering)
- Clears progress on success or failure
- Returns initialCommands instead of running execSync (setup dispatched to terminal pane)
- Accepts pendingId for progress correlation

Renderer:
- pendingWorkspaces local collection (localStorage-backed via @tanstack/react-db)
  with full draft data for retry on failure
- pending-attachment-store.ts: IndexedDB wrapper for attachment blobs
  (store/load/clear keyed by pendingId)
- useCreateDashboardWorkspace accepts + forwards pendingId
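
The in-memory progress map with a TTL sweep described under "Host-service" could look roughly like the sketch below. The class name, field names, and the explicit `now` parameter are assumptions made for illustration; the commit message only states that progress is keyed by pendingId, cleared on success or failure, and swept for stale entries.

```typescript
// Hypothetical sketch of an in-memory progress store with a TTL sweep.
// Names and shapes here are illustrative, not the host-service's actual code.
interface ProgressEntry {
  step: string; // e.g. "ensuring_repo", "creating_worktree", "registering"
  updatedAt: number; // epoch milliseconds of the last write
}

class CreateProgressStore {
  private entries = new Map<string, ProgressEntry>();

  constructor(private readonly ttlMs: number) {}

  /** Record the current step for a pending create. */
  set(pendingId: string, step: string, now: number = Date.now()): void {
    this.entries.set(pendingId, { step, updatedAt: now });
  }

  /** Return the current step, or null when nothing is tracked. */
  get(pendingId: string): string | null {
    const entry = this.entries.get(pendingId);
    return entry ? entry.step : null;
  }

  /** Drop an entry once the create has succeeded or failed. */
  clear(pendingId: string): void {
    this.entries.delete(pendingId);
  }

  /** Remove entries older than the TTL; returns how many were removed. */
  sweep(now: number = Date.now()): number {
    let removed = 0;
    this.entries.forEach((entry, pendingId) => {
      if (now - entry.updatedAt > this.ttlMs) {
        this.entries.delete(pendingId);
        removed++;
      }
    });
    return removed;
  }
}
```

A caller would typically run `sweep()` on a timer so that entries orphaned by a crashed create do not accumulate.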

* feat: rewrite PromptGroup submit to use pendingWorkspaces collection + IndexedDB

- Insert full draft into pendingWorkspaces collection before closing modal
- Store attachment blobs in IndexedDB (keyed by pendingId/blobUUID)
- Navigate to /v2-workspace/pending/$pendingId immediately
- Fire-and-forget createWorkspace — closure survives modal unmount
- On success: update collection row to succeeded, clear IndexedDB blobs
- On failure: update collection row to failed (draft preserved for retry)
- Removed zustand stashedDraft/pendingWorkspace hooks (replaced by collection)
- Removed inline convertBlobUrlToDataUrl (moved to pending-attachment-store)

* feat: add pending workspace page with live progress polling

New route: /v2-workspace/pending/$pendingId

- Reads pending workspace from pendingWorkspaces collection via useLiveQuery
- Polls workspaceCreation.getProgress every 500ms for step-by-step progress
- Shows: workspace name, branch name, step checklist (ensuring_repo → creating_worktree → registering)
- On succeeded: auto-navigates to /v2-workspace/$workspaceId, cleans up pending row
- On failed: shows error message with Retry + Dismiss buttons
- Retry resets status to creating (full re-fire is a follow-up TODO)
- Dismiss deletes pending row + clears IndexedDB attachments

* feat: sidebar renders pending workspaces from collection, clickable

- Replace single zustand pendingWorkspace with useLiveQuery on
  pendingWorkspaces collection (supports multiple concurrent creates)
- Pending sidebar items are now clickable — navigate to
  /v2-workspace/pending/$pendingId to view progress
- Add "failed" status to creationStatus type across sidebar components
  (types, icon, collapsed button, status text helper)
- Succeeded pending workspaces are filtered out (replaced by real
  workspace from Electric sync)

* refactor: split PromptGroup into focused files

PromptGroup.tsx: 890 → 441 lines (UI render only)

Extracted:
- components/AttachmentButtons/ (74 lines)
- components/ProjectPickerPill/ (79 lines)
- components/CompareBaseBranchPicker/ (134 lines)
- hooks/useHandleCreate/ (181 lines) — full create orchestration
- types.ts (15 lines) — shared types + constants

Existing sub-components unchanged:
- GitHubIssueLinkCommand, LinkedGitHubIssuePill, LinkedPRPill, PRLinkCommand

* fix: sidebar pending workspace visual states

- Failed: red triangle icon + red "Failed" text
- Creating: keeps spinner + gray "Creating..." text

* fix: center pending workspace page layout

* fix: pending page centered on page, left-aligned text

* fix: pending page takes full width of content area

* refactor: host-service defines step labels, renderer just renders

Host-service getProgress now returns fully resolved steps with id,
label, and status (pending/active/done). The renderer maps over
the array directly — no STEP_LABELS, no STEP_ORDER, no string
matching. If the host-service adds/changes steps, the UI updates
automatically.
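
As a rough illustration of "fully resolved steps", the host-service side might derive the array from the step currently in progress. The step ids come from the earlier commits in this PR; the labels, the `resolveSteps` name, and the input shape are assumptions.

```typescript
// Hypothetical sketch: resolve each step's status from the id of the active step.
type StepStatus = "pending" | "active" | "done";

interface ProgressStep {
  id: string;
  label: string;
  status: StepStatus;
}

// Ordered step definitions. The ids appear in the commit messages; the labels are invented.
const STEP_DEFINITIONS = [
  { id: "ensuring_repo", label: "Ensuring repository" },
  { id: "creating_worktree", label: "Creating worktree" },
  { id: "registering", label: "Registering workspace" },
];

function resolveSteps(activeStepId: string): ProgressStep[] {
  const activeIndex = STEP_DEFINITIONS.findIndex((step) => step.id === activeStepId);
  return STEP_DEFINITIONS.map((step, index): ProgressStep => {
    let status: StepStatus = "pending";
    if (index < activeIndex) status = "done";
    else if (index === activeIndex) status = "active";
    return { id: step.id, label: step.label, status };
  });
}
```

With this shape the renderer only needs to map over the returned array; an unknown step id leaves every step as "pending".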

* fix: pending page top-aligned with padding, update dev seed scripts

* feat: elapsed timer + staleness detection on pending page

* fix: coerce createdAt to timestamp regardless of string/Date type

* fix: timer inline before status text

* fix: formatRelativeTime shows seconds under 1 minute

Was returning "now" for everything under 60s. Now returns "now" for
<5s, then "5s", "30s", etc. Pending page consumes it for the elapsed
timer instead of a custom formatter.
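
A minimal sketch of the behavior described above, assuming the function receives elapsed milliseconds. The real `formatRelativeTime` signature is not shown in this PR, and the minute, hour, and day branches are assumptions added so the sketch is complete.

```typescript
// Hypothetical sketch; only the "<5s => now" and "Ns under one minute" rules
// come from the commit message.
function formatRelativeTimeSketch(elapsedMs: number): string {
  const seconds = Math.floor(elapsedMs / 1000);
  if (seconds < 5) return "now";
  if (seconds < 60) return `${seconds}s`;
  const minutes = Math.floor(seconds / 60);
  if (minutes < 60) return `${minutes}m`;
  const hours = Math.floor(minutes / 60);
  if (hours < 24) return `${hours}h`;
  return `${Math.floor(hours / 24)}d`;
}
```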

* chore: remove dev seed files and dead stash restore code

- Delete dev-seed-pending-workspace.ts and DevSeedPendingWorkspace.tsx
- Remove stash restore effect from DashboardNewWorkspaceModalContent
  (draft recovery now handled by pending page retry, not modal reopen)
- Remove DevSeedPendingWorkspace from layout
- Leave zustand store untouched (V1 still uses it)

* fix: move pending page out of v2-workspace layout

The pending page was under /v2-workspace/pending/$pendingId which
caused the v2-workspace layout to render bare <Outlet /> during
route transitions. This removed the WorkspaceTrpcProvider from the
tree while TerminalPane was still mounted → crash.

Moved to /_dashboard/pending/$pendingId — completely outside the
v2-workspace layout. The layout reverts to main's version unchanged.

* docs: add comment explaining why pending page lives outside v2-workspace

* chore: remove instrumentation console.logs from workspaceCreation.create

* fix: always show dismiss on creating page, add TODO on v2-workspace layout bug

- Dismiss button always visible on pending page (not just when stale)
- Added TODO comment on v2-workspace layout explaining the bare
  Outlet bug that strips WorkspaceTrpcProvider during transitions
- Removed instrumentation console.logs

* fix: always create new branch, never try checkout existing

Create flow uses `git worktree add -b` directly. No try/catch
fallback to checkout. If branch exists despite dedup, the git
command errors and the create fails cleanly.

Checking out existing branches is a separate intent (createFromPr).
Updated decisions doc (decision #8).

* feat: use friendly two-word names for fallback branch/workspace names

Replace workspace-${uuid} with friendly-words pattern (e.g.
"cheerful-umbrella"). Generate once, use for both branch name and
workspace name when user typed neither. New shared util at
shared/utils/friendly-branch-name.ts.
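
For illustration, a two-word generator along these lines might look like the sketch below. The word lists and the injectable random source are assumptions; the actual util in shared/utils/friendly-branch-name.ts may draw from a different vocabulary.

```typescript
// Hypothetical sketch of a friendly two-word name generator.
// The word lists are invented for this example.
const ADJECTIVES = ["cheerful", "quiet", "brave", "gentle"];
const NOUNS = ["umbrella", "lantern", "meadow", "harbor"];

function friendlyBranchName(random: () => number = Math.random): string {
  const pick = (words: readonly string[]): string =>
    words[Math.floor(random() * words.length)] ?? words[0] ?? "word";
  return `${pick(ADJECTIVES)}-${pick(NOUNS)}`;
}
```

Generating the name once and reusing it for both the branch and the workspace keeps the two labels in sync when the user typed neither.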

* feat: wire up retry from pending page, move timer to end

- Extract useRetryCreate hook that re-fires createWorkspace with
  draft data from the pending collection row + attachments from IndexedDB
- Retry button calls the hook instead of inline logic
- Move elapsed timer to right-aligned end of status line

* refactor: rename compareBaseBranch to baseBranch in V2 create flow

V1 uses "compareBaseBranch" because it serves double duty as the
git fork point AND the diff comparison base. For V2's create flow
it's just the fork point — "baseBranch" is clearer. The workspace
view can derive its own compare base independently.

Renamed in: draft context, PromptGroup, useHandleCreate,
useCreateDashboardWorkspace, host-service create input schema,
pendingWorkspace collection schema, pending page retry.
V1 code untouched.

* refactor: split branch name handling into slugifyForBranch + sanitizeUserBranchName

Two clearly different operations:
- slugifyForBranch: turns arbitrary text (prompts) into a branch slug.
  Lowercases, strips special chars, collapses spaces to dashes.
- sanitizeUserBranchName: strips only what git forbids from a
  user-typed branch name. Preserves case, slashes, underscores.

Host-service no longer sanitizes — it only validates (non-empty) and
deduplicates. The renderer owns all sanitization/slugification.
Removed sanitizeBranchNameWithMaxLength from host-service utils.
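
The difference between the two operations can be sketched as follows. These are illustrative implementations, not the renderer's actual code, and the sanitizer covers only a subset of the rules in `git check-ref-format` (for example, it does not handle components ending in `.lock`).

```typescript
// Hypothetical sketch: turn arbitrary text into a lowercase, dash-separated slug.
function slugifyForBranch(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, "") // drop anything that is not a letter, digit, space, or dash
    .trim()
    .replace(/[\s-]+/g, "-") // collapse runs of spaces and dashes into one dash
    .replace(/^-+|-+$/g, ""); // no leading or trailing dash
}

// Hypothetical sketch: strip only characters and sequences git rejects,
// preserving case, slashes, and underscores.
function sanitizeUserBranchName(name: string): string {
  return name
    .replace(/[\x00-\x20\x7f~^:?*[\\]/g, "") // control chars, space, ~ ^ : ? * [ and backslash
    .replace(/\.{2,}/g, ".") // ".." is not allowed in a ref
    .replace(/@\{/g, "") // "@{" is not allowed in a ref
    .replace(/\/{2,}/g, "/") // no empty path components
    .replace(/^[/.]+|[/.]+$/g, ""); // no leading or trailing "/" or "."
}
```

For example, a prompt such as "Fix login bug!" goes through `slugifyForBranch`, while a user-typed name such as "Feature/My_Branch" goes through `sanitizeUserBranchName` and comes back unchanged.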

* refactor: rename useHandleCreate → useSubmitWorkspace, extract pure functions

- resolveNames(draft) — pure. Computes branch + workspace names from
  draft state. User-typed → sanitizeUserBranchName, prompt → slugifyForBranch,
  empty → friendlyBranchName.
- mapLinkedContext(draft) — pure. Maps linked issues/PR to API payload shape.
- useSubmitWorkspace(projectId) — the hook. Orchestrates resolve → store →
  insert → close → navigate → fire create. Calls the pure functions.

Dependency array simplified: just draft object instead of 12 individual fields.
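
The precedence rules in `resolveNames` can be sketched as below. The `Draft` shape is an assumption, and the helpers are passed in as a parameter only so the example is self-contained; per the commit message the real function takes just the draft.

```typescript
// Hypothetical sketch of name resolution; field names are illustrative.
interface Draft {
  branchName: string; // what the user typed, possibly empty
  workspaceName: string; // what the user typed, possibly empty
  prompt: string; // possibly empty
}

interface NameHelpers {
  sanitizeUserBranchName(name: string): string;
  slugifyForBranch(text: string): string;
  friendlyBranchName(): string;
}

function resolveNames(
  draft: Draft,
  helpers: NameHelpers,
): { branchName: string; workspaceName: string } {
  const typedBranch = draft.branchName.trim();
  const prompt = draft.prompt.trim();

  // Branch: user-typed > prompt slug > friendly fallback.
  // An empty result at any stage falls through to the next option.
  const branchName =
    (typedBranch && helpers.sanitizeUserBranchName(typedBranch)) ||
    (prompt && helpers.slugifyForBranch(prompt)) ||
    helpers.friendlyBranchName();

  // Workspace: user-typed > prompt > branch name.
  const workspaceName = draft.workspaceName.trim() || prompt || branchName;

  return { branchName, workspaceName };
}
```

Keeping this logic pure makes it straightforward to unit-test without mounting the hook.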

* fix: dedup suffix can no longer exceed max branch length

Truncates base name upfront to 94 chars (reserving 6 for suffix
like -99999) before appending dedup suffixes. Strips trailing .-
after truncation. Last-resort fallback uses base36 timestamp
instead of decimal for shorter output.
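
A sketch of the length-safe dedup described above, assuming a 100-character maximum (94 + 6) and a set of existing branch names supplied by the caller. The function and constant names are hypothetical, and the handling of the last-resort fallback (truncating the base further to fit the timestamp) is an assumption.

```typescript
// Hypothetical sketch of dedup with an upfront length reservation.
const MAX_BRANCH_LENGTH = 100;
const SUFFIX_RESERVE = 6; // room for a suffix like "-99999"
const MAX_ATTEMPTS = 99999;

/** Strip trailing "." and "-" characters left behind by truncation. */
function trimTrailing(name: string): string {
  return name.replace(/[.-]+$/, "");
}

function dedupeBranchName(requested: string, existing: ReadonlySet<string>): string {
  // Truncate first so that appending a suffix can never exceed the maximum.
  const base = trimTrailing(requested.slice(0, MAX_BRANCH_LENGTH - SUFFIX_RESERVE));
  if (base.length === 0) throw new Error("branch name is empty after truncation");
  if (!existing.has(base)) return base;

  for (let n = 2; n <= MAX_ATTEMPTS; n++) {
    const candidate = `${base}-${n}`;
    if (!existing.has(candidate)) return candidate;
  }

  // Last resort: a base36 timestamp, shortening the base so the total still fits.
  const stamp = Date.now().toString(36);
  const room = MAX_BRANCH_LENGTH - stamp.length - 1;
  return `${trimTrailing(base.slice(0, room))}-${stamp}`;
}
```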

* lint
Kitenite added a commit that referenced this pull request Apr 17, 2026
- host-service ai-branch-name: run trailing-trim after slice so a
  100-char truncation can't re-introduce a bare "." or "-" that git
  rejects as an invalid ref (coderabbit / cubic #2, #7).
- host-service workspace-creation.generateBranchName: reuse the
  existing listBranchNames helper instead of the inline git walk,
  which classified off the short refname and could conflate a local
  "origin/foo" with refs/remotes/origin/foo (coderabbit #3).
- packages/chat shared/small-model: drop the unused
  hasSmallModelCredentials export; only a test mock consumed it
  (greptile #4).
- resolveAnthropicCredential: on refresh failure, return null instead
  of kind:"oauth" with a stale expiresAt so callers fall back cleanly
  (cubic #8).
- chat-service.getAnthropicAuthStatus: log context when refresh throws
  instead of silently swallowing (cubic #9).
Kitenite added a commit that referenced this pull request Apr 18, 2026
…3517)

* remove 7 day rule

* Upgrade mastra

* upgrade ai

* Add mastra

* refactor(desktop): remove dead provider-diagnostics plumbing

The provider-diagnostics store was fed by callSmallModel's per-attempt
reporting, which was removed when small-model tasks moved to direct AI-SDK
+ mastracode's AuthStorage. Nothing writes to the issue map anymore, so the
clearIssue mutation, getStatuses query, and diagnosticStatus plumbing in
ModelsSettings were all no-ops.

Settings still surfaces "Session expired / Reconnect" via auth-status alone.
ProviderIssue type collapsed from 8 codes to just "expired" to match.

* fix(auth): auto-refresh expired Anthropic OAuth tokens

Anthropic credentials were read via authStorage.get() everywhere, so
mastracode's built-in refresh flow never ran. Once the 1-hour access
token expired, status flipped to "Reconnect" and users had to do a
full PKCE re-auth, even though a valid refresh token was already
stored.

Resolvers now call authStorage.getApiKey() for oauth creds on expiry,
which triggers refreshToken() and persists the refreshed credential.
getAnthropicAuthStatus does the same before declaring issue: "expired".
Mirrors the pattern already used for OpenAI small-model auth.

* review: address PR feedback from cubic + coderabbit + greptile

- host-service ai-branch-name: run trailing-trim after slice so a
  100-char truncation can't re-introduce a bare "." or "-" that git
  rejects as an invalid ref (coderabbit / cubic #2, #7).
- host-service workspace-creation.generateBranchName: reuse the
  existing listBranchNames helper instead of the inline git walk,
  which classified off the short refname and could conflate a local
  "origin/foo" with refs/remotes/origin/foo (coderabbit #3).
- packages/chat shared/small-model: drop the unused
  hasSmallModelCredentials export; only a test mock consumed it
  (greptile #4).
- resolveAnthropicCredential: on refresh failure, return null instead
  of kind:"oauth" with a stale expiresAt so callers fall back cleanly
  (cubic #8).
- chat-service.getAnthropicAuthStatus: log context when refresh throws
  instead of silently swallowing (cubic #9).
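
The ordering issue in the first item can be shown with a small sketch. The function names are hypothetical; the 100-character limit is taken from the commit message.

```typescript
// Hypothetical sketch contrasting the two orderings of trim and slice.
const MAX_LENGTH = 100;
const TRAILING_INVALID = /[.-]+$/;

// Trimming before slicing: the cut can land right after a "." or "-".
function trimThenSlice(name: string): string {
  return name.replace(TRAILING_INVALID, "").slice(0, MAX_LENGTH);
}

// Slicing before trimming: the final string never ends in "." or "-".
function sliceThenTrim(name: string): string {
  return name.slice(0, MAX_LENGTH).replace(TRAILING_INVALID, "");
}
```

With an input of 99 letters followed by "-feature", `trimThenSlice` returns a name ending in "-", while `sliceThenTrim` returns just the 99 letters.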

* fix(chat): read auth.json directly instead of importing mastracode

Importing createAuthStorage from mastracode loads the entire CLI tree
(fastembed → onnxruntime-node's 208 MB native binary) via eager
top-level requires in mastracode's CJS entry. This crashed
electron-vite bundling and bloated the get-small-model chunk.

getSmallModel now reads mastracode's auth.json file directly using
the same path resolution logic (~/Library/Application Support/mastracode/
on macOS). Zero mastracode import, zero bundle impact. The chunk stays
at 1.2 MB (just @ai-sdk/anthropic + @ai-sdk/openai).

Production build verified: compile:app succeeds, Electron main process
boots with no onnxruntime error.

* docs(desktop): add manual testing plan for PR #3517

* fix api key storage slot

* fix(auth): store API keys in dedicated slot so OAuth doesn't clobber them

setApiKeyForProvider and setStoredAnthropicApiKeyFromEnvVariables now
use authStorage.setStoredApiKey() (writes to "apikey:<provider>")
instead of authStorage.set() (writes to the main "<provider>" slot
shared with OAuth). This way connecting/disconnecting OAuth doesn't
overwrite or delete a stored API key.

resolveAuthMethodForProvider falls back to hasStoredApiKey() after
checking the main slot, so status correctly reports authenticated
when only an API key is stored.

* fix(auth): backup/restore API keys across OAuth connect/disconnect

mastracode's resolveModel only reads API keys from the main
authStorage slot (authStorage.get("anthropic")). OAuth login
overwrites this slot, and disconnect removes it — losing any
previously saved API key.

Fix: backup the API key to the dedicated apikey: slot before OAuth
connect, restore it after disconnect. setApiKeyForProvider now writes
to both slots (main for resolveModel compatibility, apikey: for
backup). resolveAuthMethodForProvider checks both.

Applies to both Anthropic and OpenAI providers.

* chore: add upstream PR reference to auth workaround

Point to mastra-ai/mastra#15483 so the backup/restore code can be
removed once upstream lands and we bump mastracode.

* refactor(desktop): derive settings provider action from status

Replace the cascade of if/else + canDisconnect flag with a single
getProviderAction(status) → connect | reconnect | logout | null.
Fixes "Active" badge + "Connect" button showing simultaneously
when authenticated via API key.

* fix(desktop): always show Logout when provider is active

Active providers now always show a Logout button. Clears OAuth or
API key depending on authMethod — no more "Active" badge with no
way to disconnect.

* fix(desktop): simplify OpenAI OAuth dialog + auto-open browser

Match Anthropic dialog's layout: remove the raw OAuth URL display
and "Tip" block, auto-open the browser on OAuth start. Change
"Back" to "Cancel" for consistency.

* refactor(desktop): unify OAuth dialogs into shared OAuthDialog

Extract shared OAuthDialog component with provider config object.
AnthropicOAuthDialog and OpenAIOAuthDialog become thin wrappers
that pass provider-specific labels and options.

* fix(desktop): show 'Copied!' feedback on Copy URL button

* refactor(desktop): merge provider account + API key into single card

Each provider section now renders AccountCard + ConfigRow inside
one rounded card with a divider, instead of two separate cards.
Removes the standalone "API Keys" collapsible section.

* refactor(desktop): compact OAuth row in provider settings card

OAuth row is now a single inline row (label + status + action)
instead of a stacked AccountCard. Both providers share the same
2-row card layout: OAuth row + API key row with divider.

* fix(desktop): contextual buttons in provider settings

Connect is now primary (filled). Save only shows when there's input.
Clear only shows when a key is saved. Removes visual noise from
empty-state provider cards.

* ui(desktop): add provider icons to settings section headers

* ui(desktop): show 'Not connected' badge instead of subtitle for disconnected providers

* ui: remove redundant disconnected subtitle

* ui: remove subtitle text from OAuth rows

* chore: remove dead AccountCard + getProviderSubtitle

* docs: update test plan to match current UI

* chore: move shipped plans to done/

---------

Co-authored-by: AviPeltz <aj.peltz@gmail.com>
AviPeltz added a commit that referenced this pull request May 10, 2026
Resolves 11 findings from greptile + coderabbit review on the
remote-control feature:

- #1 (P1): `remoteControl.get` is now `publicProcedure` and accepts the
  raw token, hashing it for constant-time comparison against the row's
  `tokenHash`. Anonymous viewers can resolve `wsUrl` without a Superset
  session — the share link itself is the credential.
- #10 (Major): the host-side `sendInput` no longer round-trips bytes
  through a latin1 string before `pty.write` re-encodes them as UTF-8
  (which corrupted any byte ≥ 0x80). Adds `pty.writeBytes` that
  forwards a `Uint8Array` straight to the daemon.
- #2: a single `cleanup()` helper now handles `onClose` and `onError`,
  removing the viewer from the session's set, detaching the handle, and
  unsubscribing the revoke listener idempotently. Fixes a leak where
  abrupt teardown could orphan up to four `MAX_VIEWERS` slots until host
  restart.
- #8: client WebSocket payloads are validated via a zod discriminated
  union before dispatch; `resize` and `runCommand` are wrapped in
  try/catch like `input` was.
- #5: `TerminalRemoteControlButton` hydrates from
  `remoteControl.listForWorkspace` on mount and refreshes every 30s, so
  the live badge survives remounts and reflects backend revocation /
  expiry. The original `webUrl` is unrecoverable after `create` (the
  cloud only stores `tokenHash`), so Copy Link is disabled when we
  don't hold it.
- #3: handshake-time auth result is cached on the WS context; per-
  message handling just compares `expiresAt` against `now` instead of
  re-running HMAC + SHA-256 at 200/s/viewer.
- #4: the bearer token is now passed in the URL fragment
  (`#remoteControlToken=…`), not the query string. The fragment never
  reaches the server, never appears in `Referer` headers, and stays out
  of access logs and history. A new `RemoteTerminalLoader` client
  component reads `location.hash` after mount.
- #7: the web viewer writes a one-time dim hint into xterm when the
  user types in `command` mode so silent drops are explained.
- #9: oversized PTY chunks (> 256 KB in one event) now have their tail
  preserved instead of being pushed-and-immediately-shifted out of the
  ring, which would have left late-joining viewers with an empty
  snapshot.
- #11: host-side mintToken schema now `.min(MIN_TTL).max(MAX_TTL)`,
  matching `mintRemoteControlToken`'s internal clamp.
- #12: revoke `UPDATE` adds `organizationId` and `status='active'` to
  the `WHERE` so re-revoke is idempotent and cannot transition an
  `expired` row to `revoked`.

Skipped: #6 (relay replay/tunnel-ownership) — the existing host proxy
paths don't call `maybeReplay` either, so this PR doesn't regress the
single-region behavior. Multi-region replay is a broader gap tracked
separately.
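
For item #4, reading the token out of the fragment on the client can be sketched as below. The helper name is hypothetical; in a browser the input would be `window.location.hash`.

```typescript
// Hypothetical sketch: extract a named value from a URL fragment such as
// "#remoteControlToken=abc". Returns null when the key is absent or malformed.
function readTokenFromHash(hash: string, key: string = "remoteControlToken"): string | null {
  const fragment = hash.startsWith("#") ? hash.slice(1) : hash;
  for (const pair of fragment.split("&")) {
    const separator = pair.indexOf("=");
    if (separator === -1) continue;
    try {
      if (decodeURIComponent(pair.slice(0, separator)) === key) {
        const value = decodeURIComponent(pair.slice(separator + 1));
        return value.length > 0 ? value : null;
      }
    } catch {
      return null; // malformed percent-encoding
    }
  }
  return null;
}
```

Since browsers do not send the fragment in the HTTP request, the token has to be read client-side after mount, which matches the role the commit gives to `RemoteTerminalLoader`.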
AviPeltz added a commit that referenced this pull request May 11, 2026
* feat: browser-based remote control for v2 desktop terminals

Adds an end-to-end remote control flow: a desktop user clicks Share on a
v2 terminal pane, gets a one-time URL, and anyone opening the link in a
browser sees the live terminal output and can type into it.

Pieces:
- v2_remote_control_sessions table + remote_control_session_{mode,status}
  enums, with a unique index on token_hash. Cloud row stores the SHA-256
  hash only; the host-service mints and verifies the HMAC-signed token
  end-to-end so a leaked DB row cannot grant new sessions.
- Shared protocol module (packages/shared/remote-control-protocol)
  pinning the wire format, capabilities, and limits.
- Host-service: session manager (token mint/verify, registry, expiry
  sweep), /remote-control/:sessionId WS route, attachTerminalViewer fan-
  out on TerminalSession (256 KB tail ring, output sequence, viewer
  set), and terminal.remoteControl.{mintToken,revoke,listActive} tRPC.
- Cloud tRPC: remoteControl.{create,get,revoke,listForWorkspace,
  expireStale}; mints a short user JWT for the relay POST and inserts
  the row only after the host returns a token.
- Relay: lets /hosts/:hostId/remote-control/* skip the user-JWT gate;
  the per-session HMAC validated by the host is the credential.
- Desktop UI: TerminalRemoteControlButton on v2 terminal panes (radio
  icon -> live badge dropdown with copy/stop) wired through pane
  registry. Host-service spawn now propagates HOST_SERVICE_SECRET into
  the child env so the secret derivation is stable.
- Web viewer: /agents/remote-control/[sessionId] page + RemoteTerminal
  client component (xterm.js + addon-fit, mobile-only key toolbar,
  status badge, copy/stop controls).

Out of scope: viewer-count fan-out, host->cloud heartbeat for
last_connected_at, host-driven resize push, and rebuilding the in-
memory session registry across host-service restarts (viewers see
session-not-found and the desktop user re-shares).

* style(web): match desktop terminal theme in remote-control viewer

Pulls the xterm options and color palette from the desktop's default
"dark" (Ember) theme so the browser viewer renders identical font, font
size, scrollback, ANSI colors, cursor, and selection. The chrome (header,
buttons, banners, mobile toolbar) now uses the same warm-dark surface
tones (#151110/#1a1716/#2a2827) and matches the ANSI accent colors for
status badges and the destructive Stop button.

`vtExtensions` (kittyKeyboard) and `scrollbar` are intentionally omitted
because they only exist on the desktop's xterm beta build; the stable
web release does not type them.

* fix(remote-control): address PR review findings

Resolves 11 findings from greptile + coderabbit review on the
remote-control feature:

- #1 (P1): `remoteControl.get` is now `publicProcedure` and accepts the
  raw token, hashing it for constant-time comparison against the row's
  `tokenHash`. Anonymous viewers can resolve `wsUrl` without a Superset
  session — the share link itself is the credential.
- #10 (Major): the host-side `sendInput` no longer round-trips bytes
  through a latin1 string before `pty.write` re-encodes them as UTF-8
  (which corrupted any byte ≥ 0x80). Adds `pty.writeBytes` that
  forwards a `Uint8Array` straight to the daemon.
- #2: a single `cleanup()` helper now handles `onClose` and `onError`,
  removing the viewer from the session's set, detaching the handle, and
  unsubscribing the revoke listener idempotently. Fixes a leak where
  abrupt teardown could orphan up to four `MAX_VIEWERS` slots until host
  restart.
- #8: client WebSocket payloads are validated via a zod discriminated
  union before dispatch; `resize` and `runCommand` are wrapped in
  try/catch like `input` was.
- #5: `TerminalRemoteControlButton` hydrates from
  `remoteControl.listForWorkspace` on mount and refreshes every 30s, so
  the live badge survives remounts and reflects backend revocation /
  expiry. The original `webUrl` is unrecoverable after `create` (the
  cloud only stores `tokenHash`), so Copy Link is disabled when we
  don't hold it.
- #3: handshake-time auth result is cached on the WS context; per-
  message handling just compares `expiresAt` against `now` instead of
  re-running HMAC + SHA-256 at 200/s/viewer.
- #4: the bearer token is now passed in the URL fragment
  (`#remoteControlToken=…`), not the query string. The fragment never
  reaches the server, never appears in `Referer` headers, and stays out
  of access logs and history. A new `RemoteTerminalLoader` client
  component reads `location.hash` after mount.
- #7: the web viewer writes a one-time dim hint into xterm when the
  user types in `command` mode so silent drops are explained.
- #9: oversized PTY chunks (> 256 KB in one event) now have their tail
  preserved instead of being pushed-and-immediately-shifted out of the
  ring, which would have left late-joining viewers with an empty
  snapshot.
- #11: host-side mintToken schema now `.min(MIN_TTL).max(MAX_TTL)`,
  matching `mintRemoteControlToken`'s internal clamp.
- #12: revoke `UPDATE` adds `organizationId` and `status='active'` to
  the `WHERE` so re-revoke is idempotent and cannot transition an
  `expired` row to `revoked`.

Skipped: #6 (relay replay/tunnel-ownership) — the existing host proxy
paths don't call `maybeReplay` either, so this PR doesn't regress the
single-region behavior. Multi-region replay is a broader gap tracked
separately.
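
The oversized-chunk fix (#9) can be sketched roughly as follows. `SnapshotRing` and its budget are illustrative assumptions, not the host's actual buffer, and string length stands in for byte size.

```typescript
// Hypothetical size-budgeted ring of PTY chunks (assumes maxSize > 0).
class SnapshotRing {
  private chunks: string[] = [];
  private size = 0;
  private readonly maxSize: number;

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  push(chunk: string): void {
    // A chunk larger than the whole budget would be pushed and then
    // immediately evicted; keep only its tail so the ring is never empty.
    const kept = chunk.length > this.maxSize ? chunk.slice(-this.maxSize) : chunk;
    this.chunks.push(kept);
    this.size += kept.length;
    // Evict the oldest chunks until the ring fits the budget again.
    while (this.size > this.maxSize && this.chunks.length > 1) {
      const dropped = this.chunks.shift()!;
      this.size -= dropped.length;
    }
  }

  // What a late-joining viewer would receive.
  snapshot(): string {
    return this.chunks.join("");
  }
}
```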

* fix(remote-control): address second-round PR review

Resolves four further findings on PR #4345:

- High: redact `remoteControlToken` (and any `token` query param) from the
  relay's request logger. The viewer has to put the bearer on the WS
  upgrade URL because browser WebSockets can't carry custom headers, and
  Hono's default `logger()` would otherwise spill the raw token into Fly
  logs / Sentry breadcrumbs.

- High: when the bypass for `/hosts/:hostId/remote-control/*` skips the
  user-JWT `authMiddleware`, still run the tunnel-presence check + Fly
  `maybeReplay`. Previously a viewer landing on a relay instance that
  doesn't own the destination tunnel would get a hard failure instead
  of a `fly-replay` to the right region/instance.

- Medium: the web viewer's Stop button now calls a new public
  `remoteControl.revokeWithToken({ sessionId, token })` mutation. The
  protected `revoke` requires a Superset session, which anonymous share
  recipients don't have, so the previous wiring silently 401'd. The
  bearer token IS the credential — anyone holding it has the same
  authority as whoever they got the link from. Constant-time SHA-256
  match against `tokenHash`, then revoke + best-effort host tear-down.

- Medium: `revoke` (protected) and `listForWorkspace` now require host
  membership in addition to active org membership, matching the gate
  on `create`. Otherwise an org member who isn't on the host could
  enumerate or revoke other people's share sessions.

The shared host-revoke flow is factored into `callHostRevoke()` so both
the protected and public revoke paths use the same best-effort tear-
down.
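
As an illustration of the log-redaction finding, a minimal sketch, assuming the logger is handed the request path plus query string. `redactUrl` and the `REDACTED` placeholder are hypothetical; how the relay wires this into its logger is not shown here.

```typescript
// Parameter names taken from the commit message above.
const SENSITIVE_PARAMS = ["remoteControlToken", "token"];

// Hypothetical helper: returns path + query with sensitive values masked.
function redactUrl(rawUrl: string): string {
  // The base is only needed so relative paths parse; it is never emitted.
  const url = new URL(rawUrl, "http://placeholder.invalid");
  for (const name of SENSITIVE_PARAMS) {
    if (url.searchParams.has(name)) {
      url.searchParams.set(name, "REDACTED");
    }
  }
  return `${url.pathname}${url.search}`;
}
```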

* fix(remote-control): make anonymous viewers actually work in prod

- Move /agents/remote-control/[sessionId] out of the `(agents)` route
  group into `(public)` so it skips `getAgentsUiAccess`, and add it to
  `proxy.ts` publicRoutes so unauthenticated viewers aren't redirected
  to /sign-in before the page mounts.
- Allow the relay WebSocket origin in the prod CSP `connect-src` so
  `wss://relay…` isn't blocked once `ws:/wss:` are dropped outside dev.
- Stop swallowing host-revoke failures: both revoke paths now throw a
  TRPCError if the host call fails, so the Stop button can't report
  success while in-memory host sessions (and connected viewers) live on.
  The cloud row still flips to `revoked` first, so retries stay
  idempotent and new attaches via `get` are blocked either way.

* fix(remote-control): keep bearer tokens out of URLs and harden cloud gates

- Convert `remoteControl.get` from a query to a mutation so tRPC's
  httpBatchLink puts the bearer token in the POST body instead of the
  URL query string (which would land in access logs and Referer). The
  web viewer now calls `.mutate()`.
- `get` refuses to hand out `wsUrl`/`routingKey` for non-active rows
  (revoked/expired). Also promotes active rows past `expiresAt` to
  `expired` even if the sweep hasn't run, so a just-expired share
  can't slip through. `SessionMeta.wsUrl` is now `string | null` and
  the WS-connect effect gates on it.
- Add `timeoutMs: 5000` to the mintToken relayMutation in `create`
  so a stuck host can't pin the Share button in "Starting…".
- Wrap the cloud `INSERT` in try/catch; on failure best-effort call
  `callHostRevoke` so the host-side minted token isn't orphaned and
  invisible to `listForWorkspace`/`revoke` until the host TTL sweep.
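
The expiry promotion in `get` can be sketched as a pure function. The row shape, the helper names, and the inclusive `<=` boundary are assumptions for illustration, not the actual cloud code.

```typescript
type SessionStatus = "active" | "revoked" | "expired";

// Hypothetical subset of the session row.
interface SessionRow {
  status: SessionStatus;
  expiresAt: Date;
}

// Treats an active row past its expiry as expired, even if no sweep has run.
function effectiveStatus(row: SessionRow, now: Date): SessionStatus {
  if (row.status === "active" && row.expiresAt.getTime() <= now.getTime()) {
    return "expired";
  }
  return row.status;
}

// Connection details are only handed out for rows that are still active.
function wsUrlFor(row: SessionRow, wsUrl: string, now: Date): string | null {
  return effectiveStatus(row, now) === "active" ? wsUrl : null;
}
```

Returning `null` rather than throwing mirrors the `string | null` type on `SessionMeta.wsUrl`, so the viewer can still render the session's final state.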

* chore(db): renumber remote-control migration to 0050 after main merge

Main shipped its own 0048 + 0049 while this branch had the original
0048_add_v2_remote_control_sessions. Regenerated via drizzle-kit
generate; same DDL, new slot.

* feat(remote-control): gate Share button on PostHog flag + add Open-in-browser

- New `WEB_REMOTE_CONTROL_ACCESS` (`web-remote-control-access`) feature
  flag controls who sees the Share button on v2 desktop terminal panes.
  Evaluated against the sharer's user id; the per-session HMAC stays the
  credential for anyone with the link, so this only gates session
  creation.
- TerminalRemoteControlButton returns null when the flag is off and skips
  the 30s cloud-hydrate poll for that user, so non-cohort users don't
  pay a tRPC round-trip every interval.
- Adds an "Open in browser" item to the live-badge dropdown that uses
  \`window.open(url, "_blank")\` (Electron routes it to the system
  browser), so the sharer can verify the link without leaving the app
  to paste it.

* refactor(remote-control): one-folder-per-component layout for web viewer

- Promote `RemoteTerminalLoader` to a sibling folder under `[sessionId]/
  components/RemoteTerminalLoader/` so `page.tsx`'s import resolves
  through a barrel that actually points at it (used to import the
  loader via the `RemoteTerminal` barrel, which was misleading).
- Extract inline `MobileToolbar` from `RemoteTerminal.tsx` into nested
  `RemoteTerminal/components/MobileToolbar/MobileToolbar.tsx` per the
  project's one-folder-per-component convention. No behavior change.

* fix(remote-control): close enumeration, schema leak, observability gaps

- Collapse `get` + `revokeWithToken` into a single generic 401
  ("Invalid remote control session or token") whether the row is
  missing or the token is wrong. Always runs the constant-time
  compare — against the row's tokenHash when present, otherwise
  against DUMMY_TOKEN_HASH — so timing and response both equalize
  and sessionIds can't be probed via these endpoints.
- Wrap the raw drizzle/pg error in `create`'s INSERT-failure path
  with TRPCError so pg constraint / column names don't get echoed
  back to clients through the default tRPC serializer.
- Upgrade the orphan-cleanup swallow from `console.warn` to a
  structured `console.error("[remote-control:orphan-host-session]",
  { sessionId, hostId, organizationId, insertError, revokeError })`
  so log scrapers can alert and a future Sentry
  `captureConsoleIntegration` picks it up automatically.
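
The equalized compare from the first bullet can be sketched with Node's `crypto` module. The dummy value and the `tokenMatches` name are assumptions; only `DUMMY_TOKEN_HASH` and `tokenHash` appear in the change itself.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

function sha256(value: string): Buffer {
  return createHash("sha256").update(value).digest();
}

// Assumption: a fixed placeholder hash used when no row exists, so the
// same amount of work happens on both paths.
const DUMMY_TOKEN_HASH = sha256("dummy-remote-control-token");

// storedHash is null when the session row is missing. Both buffers are
// 32-byte SHA-256 digests, which timingSafeEqual requires to be equal length.
function tokenMatches(storedHash: Buffer | null, presentedToken: string): boolean {
  const expected = storedHash ?? DUMMY_TOKEN_HASH;
  const equal = timingSafeEqual(expected, sha256(presentedToken));
  // A match against the dummy hash must never count as success.
  return storedHash !== null && equal;
}
```

A caller would map any `false` result to the same generic 401, regardless of which branch produced it.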

* docs(remote-control): plan for remaining work

* fix(remote-control): debounce viewer resize broadcasts via requestAnimationFrame

ResizeObserver can fire at refresh-rate (~60Hz) during a window-drag,
which immediately trips the host's REMOTE_CONTROL_RESIZE_RATE_PER_SEC
= 10 rate limit and surfaces a spurious "rate-limited" banner in the
viewer during normal use. Coalesce to one fit+broadcast per animation
frame and cancel any pending frame in the cleanup so we don't
fit+send after the WS is gone.

* fix(remote-control): trailing-debounce resize broadcasts to fix rate-limit banner

The RAF coalescing from the previous commit didn't actually help —
ResizeObserver already fires once per layout, so RAF was a no-op for
sustained window-drag, and the host's 10/s bucket still drained in
~166ms and tripped the "rate-limited" error.

Switch to a trailing 200ms debounce: keep calling `fit()` per event so
local rendering stays responsive, but only fire the host-side resize
message once after the user stops dragging. 5 Hz worst case, well
under the host's 10/s cap.
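
A trailing debounce of this kind can be sketched as below. The `Scheduler` indirection and all names are hypothetical, added only so the timing can be exercised without real timers; the viewer's actual code may differ.

```typescript
// Hypothetical timer abstraction; defaults to the real setTimeout/clearTimeout.
interface Scheduler {
  set(cb: () => void, ms: number): unknown;
  clear(handle: unknown): void;
}

const realScheduler: Scheduler = {
  set: (cb, ms) => setTimeout(cb, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

// Only the last call within `delayMs` fires, once the calls stop.
function trailingDebounce<A extends unknown[]>(
  fn: (...args: A) => void,
  delayMs: number,
  scheduler: Scheduler = realScheduler,
): { call: (...args: A) => void; cancel: () => void } {
  let handle: unknown = null;
  let armed = false;

  const cancel = (): void => {
    if (armed) {
      scheduler.clear(handle);
      armed = false;
    }
  };

  const call = (...args: A): void => {
    cancel(); // every new event restarts the wait
    armed = true;
    handle = scheduler.set(() => {
      armed = false;
      fn(...args);
    }, delayMs);
  };

  return { call, cancel };
}
```

With a 200ms delay, the wrapped function fires at most five times per second. Calling `cancel` from the effect cleanup avoids sending after the WebSocket is gone.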

* ci(sherif): allow @xterm/{xterm,addon-fit} version split

Desktop's terminal-runtime uses the @xterm beta 6.x track for
kittyKeyboard + scrollbar options; the new web remote-control viewer
uses stable 5.x because the beta is too churny for a fresh feature.
The split is intentional and documented in the viewer's source, so
tell Sherif to ignore those two deps rather than force-aligning to
either track.